Learning and synthesizing MPEG-4 compatible 3-D face animation from video sequence
Authors
Abstract
In this paper, we present a new system that applies an example-based learning method to learn facial motion patterns, such as lip motion and facial expressions, from a video sequence of an individual's facial behavior, and uses them to create vivid three-dimensional (3-D) face animation according to the MPEG-4 face animation parameter (FAP) definition. The system consists of three key modules: face tracking, pattern learning, and face animation. In face tracking, to reduce the complexity of the tracking process, a novel coarse-to-fine strategy combined with a Kalman filter is proposed for localizing key facial landmarks in each frame of the video. The landmark sequence is normalized into a visual feature matrix and then fed to the next stage. In pattern learning, during the pre-training stage, the parameters of the camera that captured the video are supplied together with the training video data, so that the system can estimate, with the aid of computer-vision methods, the basic mapping from the normalized two-dimensional (2-D) visual feature matrix to a representation in the 3-D MPEG-4 FAP space. In the practice stage, since camera parameters are usually not available with video data, the system uses machine learning to recover the missing 3-D information that the mapping needs to represent face orientation. The example-based learning in this system integrates several methods, including clustering, hidden Markov models (HMMs), and artificial neural networks (ANNs), to achieve a better 2-D-to-3-D conversion and a better estimate of the missing 3-D information; the resulting mapping is then used to drive the face animation. In face animation, the system can synthesize animation following any type of face motion in the video. Experiments show that our system produces more vivid face-motion animation than earlier systems.
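The face-tracking module combines coarse-to-fine landmark localization with a Kalman filter that stabilizes landmark positions across frames. The sketch below shows the generic predict/correct loop such a tracker could use for one 2-D landmark under a constant-velocity motion model; the class name, state layout, and noise levels are illustrative assumptions, not values from the paper.

```python
import numpy as np

class LandmarkKalman:
    """Constant-velocity Kalman filter for a single 2-D facial landmark.

    A minimal sketch of the predict/correct loop a landmark tracker
    could use; matrices and noise magnitudes are assumed, not taken
    from the paper.
    """

    def __init__(self, x0, y0, dt=1.0):
        # State vector: [x, y, vx, vy]
        self.x = np.array([x0, y0, 0.0, 0.0])
        self.P = np.eye(4) * 10.0               # state covariance (assumed)
        self.F = np.array([[1, 0, dt, 0],       # constant-velocity dynamics
                           [0, 1, 0, dt],
                           [0, 0, 1,  0],
                           [0, 0, 0,  1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],        # only position is observed
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * 0.01               # process noise (assumed)
        self.R = np.eye(2) * 1.0                # measurement noise (assumed)

    def predict(self):
        """Propagate the state one frame ahead; returns predicted (x, y)."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, zx, zy):
        """Correct the prediction with a noisy detection; returns filtered (x, y)."""
        z = np.array([zx, zy])
        y = z - self.H @ self.x                 # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)  # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]

# Track a landmark moving right at 2 px/frame under noisy detections.
rng = np.random.default_rng(0)
kf = LandmarkKalman(0.0, 0.0)
for t in range(1, 30):
    kf.predict()
    est = kf.update(2.0 * t + rng.normal(0, 1), rng.normal(0, 1))
print(est)  # filtered estimate near (58, 0)
```

The prediction step also gives the coarse search region for the next frame, which is how a coarse-to-fine strategy keeps the per-frame landmark search cheap.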
Similar references
A Speech Driven Face Animation System Based on Machine Learning
Lip synchronization is the key issue in speech-driven face animation systems. In this paper, several clustering and machine learning methods are combined to estimate face animation parameters from audio sequences, and the learning results are then applied to an MPEG-4 based speech-driven face animation system. Based on a large recorded audio-visual database, an unsupervised clustering algorithm is prop...
ShowFace: MPEG-4 Compatible Face Animation Framework
This paper proposes ShowFace, a streaming structure for personalized face animation. This structure extends (and complies with) MPEG-4 face animation by defining image transformations required for certain facial movements, and also by providing a higher level Face Modeling Language (FML) for modeling and control purposes. The proposed framework consists of components for parsing the input scrip...
MPEG-4 based animation with face feature tracking
Human interfaces for computer graphics systems are now evolving towards a multi-modal approach. Information gathered using visual, audio, and motion-capture systems is becoming increasingly important within user-controlled virtual environments. This paper discusses real-time interaction with a virtual world through the visual analysis of human facial features from video. The underlying approach...
Model-Based Eye Detection and Animation
In this thesis we present a system that extracts the eye motion from a video stream containing a human face and applies this eye motion to a virtual character. By eye motion estimation, we mean the information that describes the location of the eyes in each frame of the video stream. By applying this eye motion estimation to a virtual character, we make the virtual face mov...
MPEG-4 Compatible Faces from Orthogonal Photos
MPEG-4 is scheduled to become an International Standard in March 1999. This paper demonstrates an experiment with a virtual cloning method and animation system that is compatible with the MPEG-4 standard facial object specification. Our method uses orthogonal photos (front and side views) as input and reconstructs the 3-D facial model. The method is based on extracting MPEG-4 face definition par...
Journal: IEEE Trans. Circuits Syst. Video Techn.
Volume: 13, Issue:
Pages: -
Publication date: 2003